Modified Method of Document Text Extraction from Document Images Using Haar DWT

Author

  • Navjot Kaur
Abstract

This paper extends a technique for extracting document text from images using the 2-D Haar wavelet. The discrete wavelet transform (DWT) is a useful tool for signal analysis and image processing, especially for multi-resolution representation. It can decompose a signal into different components in the frequency domain. The two-dimensional discrete wavelet transform (2-D DWT) decomposes an input image into four sub-bands: one average component (LL) and three detail components (LH, HL, HH). The multi-resolution property of the 2-D DWT is used to detect the edges of the original image. We select an appropriate threshold value and preliminarily remove the non-text edges in the detail component sub-bands. We then use the logical AND operator to further remove the non-text regions. This is combined with a second idea, removing large-area regions from the image, to eliminate the non-text regions from document images.

Keywords: Average component, Detail components, Document text, DWT, Multi-resolution of 2-D DWT, Non-Text Edges, Sub-band images, Text extraction, 2-D Haar Wavelet
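The pipeline described in the abstract (Haar decomposition, thresholding of the detail sub-bands, logical AND) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function names `haar_dwt2` and `candidate_text_mask`, the fixed `threshold` parameter, and the direct AND of the three thresholded maps are all assumptions made for the example. The full method may include further steps, such as the large-area removal mentioned above, which are omitted here.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2-D Haar DWT; returns (LL, LH, HL, HH).

    The image must have even height and width. Sub-band naming conventions
    differ between texts; here LH holds differences between rows, HL holds
    differences between columns, and HH holds diagonal differences.
    """
    x = np.asarray(image, dtype=float)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("image height and width must be even")
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # average component
    lh = (a + b - c - d) / 2.0  # top row minus bottom row
    hl = (a - b + c - d) / 2.0  # left column minus right column
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def candidate_text_mask(image, threshold):
    """Hypothetical helper: threshold each detail sub-band, then AND them.

    A location is kept only if it has a strong response in all three
    detail sub-bands. The threshold is an assumed, hand-picked value.
    """
    _, lh, hl, hh = haar_dwt2(image)
    edges = [np.abs(band) > threshold for band in (lh, hl, hh)]
    return edges[0] & edges[1] & edges[2]
```

In this sketch a straight vertical or horizontal edge responds in only one detail sub-band, so the AND discards it, while a block with intensity changes in every direction survives. The returned mask is at half the resolution of the input image.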


Similar resources

Text Extraction of Vehicle Number Plate and Document Images Using Discrete Wavelet Transform in MATLAB

Text extraction from colour images is a challenging task in computer vision. The concept of text extraction is derived from vehicle plate recognition and the extraction of individual plate characters. Example applications include automatic image indexing, assistance for visually impaired people, optical character reading, and keyword searching in a document image. The continuous research has...


Document Text Extraction from Document Images Using Haar Discrete Wavelet Transform

This paper presents an efficient and computationally fast method to extract text regions from documents. We propose using the Haar discrete wavelet transform (DWT) [9], which operates the fastest among all wavelets because its coefficients are either 1 or -1. This is one of the reasons we employ the Haar DWT to detect edges of candidate text regions. First, we detect edges and then line feature ...


Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing stage by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...


Correcting Geometric Distortion of Text Using Geometric Information of Text Lines

Document images produced by scanners or digital cameras usually have photometric and geometric distortions. If either of these effects distorts a document, recognition of words from the document image using OCR is subject to errors. In this paper we propose a novel approach to significantly remove geometric distortion from document images. In this method first we extract document lines from do...


Image Segmentation for Text Extraction

This paper presents a methodology for extracting text from images such as document images and scene images. Text that appears in these images contains important and useful information. Text extraction in images has been used in a large variety of applications such as mobile robot navigation, document retrieval, object identification, and vehicle license plate detection. In this paper, we emplo...




Publication date: 2012